This report covers the activities of John R. Rice (Co-PI) and associates at Purdue
University from May 1990 through September 30, 1991. The three-year funding was
terminated early and work stopped by September 30, 1991 (as all funds had been
spent).  The  accomplishments  include  7  papers  published  in  or  submitted  to  technical 
journals,  10  conference  presentations  with  papers  in  the  conference  proceedings,  3  other 
technical  reports,  and  two  Ph.D.  theses. 

The  objective  of  this  work  was  to  explore  algorithms  and  their  implementation  for 
future  advanced  parallel  systems.  These  systems  are  assumed  to  have  hundreds  or  even 
thousands  of  processors  and  to  be  able  to  concentrate  their  computing  power  on  one  or 
a  small  number  of  tasks.  The  three  principal  questions  to  be  explored  were: 

1.  Are  there  algorithms  for  the  crucial  applications  which  have  enough  parallelism 
to  allow  the  power  of  the  advanced  parallel  systems  to  be  fully  exploited? 

2. What languages and implementation tools are needed for efficient programming
of  these  algorithms? 

3.  What  are  the  relative  performances  of  different  algorithm  types?  Of  different 
architecture  types?  Of  different  implementation  languages? 

The  research  results  obtained  are  grouped  within  four  areas,  basically  those 
described  in  the  original  proposal.  We  state  the  principal  problem  for  each  area  and 
then  list  the  papers,  conference  presentations,  theses  and  technical  reports  for  each  area, 
followed  by  a  short  summary  of  principal  or  typical  results  obtained. 

A.  ANALYSIS  OF  THE  PERFORMANCE  OF  FUTURE  COMPUTATIONS 

Principal Problem: Analyze the practicality of using massive parallelism efficiently in
large  scale  scientific  and  engineering  computations. 

1.  D.C.  Marinescu  and  J.R.  Rice,  Synchronization  and  load  imbalance  effects  in 
distributed memory multi-processor systems. Concurrency Practice and Experience
3, (1991), 593-625.

2.  D.C.  Marinescu  and  J.R.  Rice,  On  high  level  characterization  of  parallelism,  J. 
Par.  Dist.  Comp.,  (1993),  to  appear. 

3. D.C. Marinescu and J.R. Rice, The effects of communication and latency on
synchronization and dynamic load balance on a hypercube, Proc. 5th Intl. Parallel
Processing Symposium, (Kumar, ed.), IEEE Press, (1991), 18-25.

4. D.C. Marinescu, C.E. Houstis, J.R. Rice, H. Waldschmidt, and B.
Waltsburger, Distributed supercomputing, in Future trends '90, IEEE Press,
(1990), 381-387.

5.  Jin  Jing  and  J.R.  Rice,  Problems  to  test  parallel  and  vector  languages  -  II,  CSD- 
TR  1016,  Purdue  University,  CS  Department,  December  1990. 

B.  BENCHMARKING  EXISTING  COMPUTATIONS 

Principal Problem: Analyze the performance of existing parallel software and
machines.  Develop  methodology  for  benchmarking  the  performance  of  scientific  and 
engineering  software. 

6.  D.C.  Marinescu,  J.R.  Rice,  and  E.A.  Vavalis,  Performance  of  iterative  methods 
for  distributed  memory  machines,  Applied  Numerical  Mathematics  (1993),  to 
appear. Extended abstract in Proc. 13th World Congress, IMACS, Rutgers
University,  New  Brunswick,  NJ,  Vol.  2  (1991),  684-685. 

7.  Mo  Mu  and  J.R.  Rice,  Performance  of  PDE  sparse  solvers  on  hypercubes,  in 
Unstructured  Scientific  Computations  on  Scalable  Multiprocessors  (J.  Saltz,  ed.), 
MIT  Press  (1992),  345-370. 

8.  Mo  Mu  and  J.R.  Rice,  A  PDE  sparse  solver  benchmark  for  massively  parallel 
distributed  memory  multiprocessors,  in  Computer  Methods  for  Partial  Differential 
Equations  VII  (R.  Vichnevetsky,  ed.),  IMACS,  New  Brunswick,  NJ  (1992). 

C.  CONTROL  OF  PARALLEL  COMPUTATIONS 

Principal  Problem:  Determine  how  to  break  computations  into  nearly  equally  sized 
pieces  to  distribute  to  a  collection  of  processors.  Determine  how  parallel  processors 
can  synchronize  and  organize  their  work  so  as  to  avoid  or  minimize  bottlenecks. 

9.  E.N.  Houstis,  S.K.  Kortesis,  and  H.  Byun,  A  workload  partitioning  strategy  for 
PDEs by a generalized neural network. CSD-TR-934, Computer Science Department,
Purdue University (1990).

10.  N.P.  Chrisochoides,  C.E.  Houstis,  and  E.N.  Houstis,  Geometry  based  mapping 
strategies  for  PDE  computations.  In  Supercomputing  91,  ACM  Press,  NY  (1991), 
115-127. 

11.  Mo  Mu  and  J.R.  Rice,  A  grid  based  subtree-subcube  assignment  strategy  for 
solving  PDEs  on  hypercubes,  SIAM  J.  Sci.  Stat.  Comp.,  13  (1992),  826-839. 

12. N.P. Chrisochoides, C.E. Houstis, E.N. Houstis, S.K. Kortesis, P.N.
Papachiou, and J.R. Rice, Domain decomposer: A software tool for mapping
PDE  computations  to  parallel  architectures.  In  Domain  Decomposition  Methods 
for  Partial  Differential  Equations,  SIAM  (1991),  341-357. 


13.  E.N.  Houstis  and  J.R.  Rice,  Parallel  ELLPACK:  A  development  and  problem 
solving  environment  for  high  performance  computing  machines.  In  Programming 
Environments  for  High-Level  Scientific  Problem  Solving  (P.  Gaffney  and  E. 
Houstis,  eds.),  North-Holland,  Amsterdam  (1992),  229-241. 

14.  N.P.  Chrisochoides  and  J.R.  Rice,  Partitioning  heuristics  for  PDE  computations 
based  on  parallel  hardware  and  geometry  characteristics.  In  Computer  Methods 
for  Partial  Differential  Equations  VII  (R.  Vichnevetsky,  ed.),  IMACS,  New 
Brunswick,  NJ  (1992). 

15. N.P. Chrisochoides, On the Mapping of Partial Differential Equations onto Distributed
Memory MIMD Parallel Machines, Department of Computer Sciences,
Purdue  University,  Ph.D.  Thesis  (1992). 

D.  PARALLEL  ALGORITHMS  FOR  PHYSICAL  PROBLEMS 

Principal Problem: Create algorithms that are easily broken into parallel subcomputations
and whose total work is near the minimum possible.

16. M. Aboelaze, N.P. Chrisochoides, E.N. Houstis, and C.E. Houstis, Parallelization
of level 2 and 3 BLAS operations on distributed memory machines, CAPO
report  CER-91-04,  Computer  Sciences,  Purdue  University,  1991. 

17.  A.  Chen  and  J.R.  Rice,  On  grid  refinement  at  point  singularities  for  h-p  methods, 
Intl. J. Num. Meth. Engr., 33, (1992), 39-57.

18.  Mo  Mu  and  J.R.  Rice,  Row  oriented  Gauss  elimination  on  distributed  memory 
multiprocessors. Intl. J. High Speed Comp. (1993), to appear.

19.  A.  Hadjidimos,  E.N.  Houstis,  J.R.  Rice,  and  E.A.  Vavalis,  Iterative  line  cubic 
spline  collocation  methods  for  elliptic  partial  differential  equations  in  several 
dimensions,  SIAM  J.  Sci.  Stat.  Comp.  (1993),  to  appear. 

20.  H.S.  McFaddin  and  J.R.  Rice,  RELAX:  A  platform  for  software  relaxation.  In 
Expert  Systems  for  Scientific  Computing  (Houstis,  Rice,  and  Vichnevetsky,  eds.), 
North-Holland,  Amsterdam  (1992),  175-194. 

21.  N.P.  Chrisochoides,  E.N.  Houstis,  S.B.  Kim,  J.R.  Rice,  and  M.K.  Samartzis, 
Parallel iterative methods. In Computer Methods for Partial Differential Equations
VII (R. Vichnevetsky, ed.), IMACS, New Brunswick, NJ (1992).

22. H.S. McFaddin, An Object Based Problem Solving Environment for Collaborating
PDE Solvers and Editors, Department of Computer Sciences, Purdue University,
Ph.D. Thesis (1992).